Abstract

We describe a new variation of a mathematical card trick, whose analysis leads 
to new lower bounds for data compression and estimating the entropy of Markov 
sources. 

Several years ago, an article in the popular press [1] described the following
mathematical card trick: the magician gives a deck of cards to an audience 
member, who cuts the deck, draws six cards and lists their colours; the magician 
then says which cards were drawn. The key to the trick is that the magician 
prearranges the deck so that the sequence of the cards' colours is a substring of a 
binary De Bruijn cycle of order six, i.e., so that every sextuple of colours occurs 
at most once. Although the trick calls only for the magician to name the cards 
drawn, he or she could also name the next card, for example, with absolute 
certainty. At the time we ran across the article, we were studying empirical
entropy, and one way to define the $k$th-order empirical entropy of a string $s$ is
as our expected uncertainty about the character in a randomly chosen position
when given the preceding $k$ characters [2]. After reading the trick's description,
it occurred to us that the $k$th-order empirical entropy of any De Bruijn cycle
of order at most $k$ is 0. Using this and other properties of De Bruijn cycles,
we were able to prove several lower bounds for data compression [3, 4]. For
example, since $\sigma$-ary De Bruijn cycles of order $k$ have length $\sigma^k$, there are
$(\sigma!)^{\sigma^{k-1}} / \sigma^k$ such sequences [5] and
$\log_2 \left( (\sigma!)^{\sigma^{k-1}} / \sigma^k \right) = \Theta(\sigma^k \log \sigma)$, a simple
counting argument proves the following theorem.

Theorem 1 (Gagie, 2006 [6]). If $k > \log_\sigma n$ then, in the worst case, we cannot
store a $\sigma$-ary string $s$ of length $n$ in $\lambda n H_k(s) + o(n \log \sigma)$ bits for any coefficient
$\lambda$.
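The properties of De Bruijn cycles used above can be checked directly for the card trick's parameters. The following is a minimal sketch, assuming the standard Lyndon-word construction of a De Bruijn cycle; the function names `de_bruijn` and `windows` and the encoding 0 = red, 1 = black are illustrative choices, not taken from the article describing the trick.

```python
from math import factorial, log2


def de_bruijn(sigma: int, k: int) -> list[int]:
    """Return one sigma-ary De Bruijn cycle of order k (length sigma**k).

    Uses the standard Lyndon-word construction; any other De Bruijn cycle
    would serve equally well for the trick.
    """
    a = [0] * (sigma * k)
    cycle: list[int] = []

    def extend(t: int, p: int) -> None:
        if t > k:
            if k % p == 0:
                cycle.extend(a[1 : p + 1])
        else:
            a[t] = a[t - p]
            extend(t + 1, p)
            for j in range(a[t - p] + 1, sigma):
                a[t] = j
                extend(t + 1, t)

    extend(1, 1)
    return cycle


def windows(cycle: list[int], k: int) -> list[tuple[int, ...]]:
    """All length-k windows of the cycle, read cyclically."""
    n = len(cycle)
    return [tuple(cycle[(i + j) % n] for j in range(k)) for i in range(n)]


colours = de_bruijn(2, 6)  # assumed encoding: 0 = red, 1 = black
sextuples = windows(colours, 6)
print(len(colours), len(set(sextuples)))  # 64 64

# Every sextuple is followed by exactly one colour, so the 6th-order
# empirical entropy of the cycle is 0.
following: dict[tuple[int, ...], set[int]] = {}
for i, w in enumerate(sextuples):
    following.setdefault(w, set()).add(colours[(i + 6) % len(colours)])
print(all(len(v) == 1 for v in following.values()))  # True

# log2 of the number of binary De Bruijn cycles of order 6,
# i.e. log2((sigma!)**(sigma**(k-1)) / sigma**k) with sigma = 2, k = 6.
count_log2 = 2 ** (6 - 1) * log2(factorial(2)) - 6 * log2(2)
print(count_log2)  # 26.0
```

A 52-card deck would use a substring of this 64-symbol cycle, in which every sextuple of colours still occurs at most once.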

In this paper we consider a variation of the trick described above, that has 
led us to some new bounds. This time, suppose the magician does not bother to 
prearrange the deck, but shuffles it instead and has the audience member draw 
seven cards; after the audience member lists the cards' colours, the magician 
has him or her replace the cards, cut the deck again and return it; the magician 
examines the deck and says which cards were drawn. It is not hard to show that
the probability of two septuples of cards having the same colours in the same
order is at most 1/128 (even if the septuples overlap), so, by the union bound
over the 51 other septuples of consecutive cards in the cut deck, the probability only
one septuple has the colours listed is at least $1 - 51/128 > 0.6$; thus, simply
examining the deck gives the magician a better than even chance of guessing the
cards drawn. Our analysis is slightly pessimistic because the probability of two
septuples' colours matching would be exactly 1/128 only if they were drawn
with replacement; drawn without replacement, the probability of two cards'
colours matching, for example, is $25/51 < 1/2$. Also, even if several septuples
have the colours listed, the magician still has some chance of guessing correctly
from amongst them.
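The bound above can be compared with a simulation. The sketch below assumes a standard deck of 26 red and 26 black cards that is read cyclically (because it is cut), and estimates how often the drawn septuple's colour sequence occurs nowhere else in the deck; the function name, trial count and seed are illustrative.

```python
import random


def unique_septuple_rate(trials: int = 5000, seed: int = 1) -> float:
    """Estimate the chance that the drawn septuple's colours occur only once.

    Assumption: 26 red (0) and 26 black (1) cards, deck read cyclically.
    The draw starts at position 0, which loses no generality after a
    uniform shuffle.
    """
    rng = random.Random(seed)
    deck = [0] * 26 + [1] * 26
    hits = 0
    for _ in range(trials):
        rng.shuffle(deck)
        doubled = deck + deck[:6]  # unrolls the cyclic deck
        drawn = doubled[0:7]
        # Compare against the 51 other septuples of consecutive cards.
        if all(doubled[i : i + 7] != drawn for i in range(1, 52)):
            hits += 1
    return hits / trials


bound = 1 - 51 / 128
estimate = unique_septuple_rate()
print(f"union bound: {bound:.4f}, simulated: {estimate:.4f}")
```

The simulated rate should come out above the bound, consistent with the remark that the analysis is pessimistic.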

Now suppose we draw the $n$ characters of a string $s$ uniformly at random from an
alphabet of size $\sigma$. By the same reasoning as above, the probability two $k$-tuples
match is $1/\sigma^k$; by linearity of expectation, the expected number of matches is
$\binom{n}{2} / \sigma^k$. The 0th-order empirical entropy of $s$ is

$$H_0(s) = \frac{1}{n} \sum_a \mathrm{occ}(a, s) \log_2 \frac{n}{\mathrm{occ}(a, s)} \le \log_2 \sigma ,$$

where $\mathrm{occ}(a, s)$ is the number of occurrences of character $a$ in $s$; the $k$th-order
empirical entropy of $s$ is

$$H_k(s) = \frac{1}{n} \sum_{|\alpha| = k} |s_\alpha| H_0(s_\alpha) ,$$

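where $s_\alpha$ is the concatenation of characters immediately following occurrences
in $s$ of the $k$-tuple $\alpha$. Since a $k$-tuple contributes to the sum only if it occurs
in at least one match, and each match involves two occurrences, calculation shows

$$E[H_k(s)] \le \frac{1}{n} \cdot \frac{2 \binom{n}{2}}{\sigma^k} \log_2 \sigma \le \frac{n}{\sigma^k} \log_2 \sigma ,$$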

which implies the following theorem. 

Theorem 2. If $k > (1 + \epsilon) \log_\sigma n$ then, in the expected case, we cannot store
a $\sigma$-ary string $s$ of length $n$ in $\lambda n H_k(s) + o(n \log \sigma)$ bits for any coefficient
$\lambda = o(n^\epsilon)$.

Proof. If $k > (1 + \epsilon) \log_\sigma n$ and $\lambda = o(n^\epsilon)$, then

$$E[\lambda n H_k(s) + o(n \log \sigma)] = o(n \log \sigma) ,$$

but the expected number of bits needed to store $s$ is $\Theta(n \log \sigma)$. □
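To make the quantities in this argument concrete, the following sketch computes the 0th-order and $k$th-order empirical entropies as defined above and compares the average $H_k$ of random strings with the bound $(n/\sigma^k) \log_2 \sigma$. The function names `h0` and `hk`, and the parameter values $\sigma = 2$, $n = 200$, $k = 12$, are illustrative assumptions chosen so that $k > \log_\sigma n$.

```python
import random
from collections import Counter, defaultdict
from math import log2


def h0(s) -> float:
    """0th-order empirical entropy of a sequence, in bits per character."""
    n = len(s)
    if n == 0:
        return 0.0
    return sum(c * log2(n / c) for c in Counter(s).values()) / n


def hk(s, k: int) -> float:
    """kth-order empirical entropy: (1/n) * sum of |s_alpha| * H0(s_alpha)."""
    n = len(s)
    followers = defaultdict(list)
    for i in range(n - k):
        # s_alpha collects the characters that follow occurrences of alpha.
        followers[tuple(s[i : i + k])].append(s[i + k])
    return sum(len(f) * h0(f) for f in followers.values()) / n


sigma, n, k = 2, 200, 12  # illustrative values with k > log_sigma(n)
rng = random.Random(0)
trials = 300
mean_hk = sum(
    hk([rng.randrange(sigma) for _ in range(n)], k) for _ in range(trials)
) / trials
bound = (n / sigma**k) * log2(sigma)
print(f"mean H_k = {mean_hk:.4f}, bound = {bound:.4f}")
```

For these parameters the average stays well below $\log_2 \sigma = 1$, which is why $\lambda n H_k(s)$ cannot account for the roughly $n \log_2 \sigma$ bits a random string needs.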

Similarly, by the union bound, the probability there are any matching $k$-tuples
at all in $s$ is at most $\binom{n}{2} / \sigma^k$, so the probability that $H_k(s) = 0$ is at least
$1 - \binom{n}{2} / \sigma^k$, implying the following theorem.

Theorem 3. If $k > (2 + \epsilon) \log_\sigma n$ for some positive constant $\epsilon$ then, with high
probability, we cannot store a $\sigma$-ary string $s$ of length $n$ in $\lambda n H_k(s) + o(n \log \sigma)$
bits for any coefficient $\lambda$.
Proof. If $k > (2 + \epsilon) \log_\sigma n$ for some positive constant $\epsilon$, then

$$\lambda n H_k(s) + o(n \log \sigma) = o(n \log \sigma)$$

with probability at least $1 - 1/n^\epsilon = 1 - o(1)$; however, the number of bits needed
to store $s$ is $\Theta(n \log \sigma)$ with probability $1 - o(1)$. □
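The probability used in this proof follows from the union bound by a short calculation, sketched here under the theorem's assumption $k > (2 + \epsilon) \log_\sigma n$, that is, $\sigma^k > n^{2 + \epsilon}$:

```latex
\[
\Pr[H_k(s) \neq 0]
  \;\le\; \binom{n}{2} \frac{1}{\sigma^k}
  \;<\; \frac{n^2}{2} \cdot \frac{1}{n^{2+\epsilon}}
  \;=\; \frac{1}{2 n^{\epsilon}}
  \;<\; \frac{1}{n^{\epsilon}} .
\]
```

Whenever $H_k(s) = 0$, the term $\lambda n H_k(s)$ vanishes for every coefficient $\lambda$, which is why no restriction on $\lambda$ is needed here.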

The upper bound above on the probability there are any matching $k$-tuples
also quickly yields an exponential lower bound on the sample complexity of
estimating the entropy of Markov sources. This stands in contrast to, e.g., the
Shannon-McMillan-Breiman Theorem (see, e.g., [7]) and bounds for estimating
the entropy of a probability distribution [8, 9, 10]. Although many papers have
been written about estimating the entropy of a Markov source (see, e.g., [11]
and references therein), we know of no previous lower bounds comparable to
the one below.

Theorem 4. Suppose a $\sigma$-ary string $s$ is generated either by a deterministic
$k$th-order Markov source (which has entropy 0) or by an unbiased memoryless
source (which has entropy $\log_2 \sigma$). No algorithm can guess the type of source
with probability at least 2/3 without reading $\Omega(\sigma^{k/2})$ characters.

Proof. Suppose there is an algorithm that guesses correctly with probability
at least 2/3 after reading $o(\sigma^{k/2})$ characters, when they are generated by an
unbiased memoryless source. By the upper bound above, with high probability
the string generated will not contain any matching $k$-tuples. It follows that we
can find a particular string $s$ of length $o(\sigma^{k/2})$ containing no matching $k$-tuples
and such that, with probability nearly 2/3, the algorithm classes $s$ as having
come from an unbiased memoryless source. Since $s$ contains no matching
$k$-tuples, we can build a deterministic $k$th-order Markov source that generates $s$
with probability 1; on this source, the algorithm errs with probability nearly
2/3. □
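The last step of the proof, building a deterministic source from a string with no matching $k$-tuples, can be sketched as follows. The names `build_source` and `generate` and the example string are hypothetical; the sketch represents the source as a table from each $k$-tuple to the single character that follows it, started from the first $k$ characters of $s$.

```python
def build_source(s: str, k: int) -> dict[str, str]:
    """Map each k-tuple of s to the character that follows it.

    Raises ValueError if some k-tuple occurs twice, since the construction
    needs s to contain no matching k-tuples.
    """
    transitions: dict[str, str] = {}
    for i in range(len(s) - k + 1):
        context = s[i : i + k]
        if context in transitions:
            raise ValueError(f"k-tuple {context!r} occurs more than once")
        # The last k-tuple has no successor in s; any character would do,
        # so s[0] is used as an arbitrary choice.
        transitions[context] = s[i + k] if i + k < len(s) else s[0]
    return transitions


def generate(start: str, transitions: dict[str, str], length: int) -> str:
    """Run the deterministic source from its initial state."""
    out = start
    k = len(start)
    while len(out) < length:
        out += transitions[out[-k:]]
    return out[:length]


s = "aaabbbab"  # illustrative string whose 3-tuples are all distinct
source = build_source(s, 3)
print(generate(s[:3], source, len(s)) == s)  # True
```

Because every context has exactly one successor, the source emits $s$ with probability 1 and has entropy 0, yet its output is indistinguishable from the sample the algorithm classed as memoryless.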